Subject: Comments on the CD1.7 draft (1) - given id of 99-xxxx
Author: Robert Jones, email: 100621.553@compuserve.com
References:
1. Committee Draft 1.7 for the proposed revision of ISO 1989:1985, COBOL
standard - PDF version. I use Adobe Acrobat Reader, version 3.
Comments:
1 Title page
There are various horizontal bars that seem extraneous. I don't have
a hardcopy version, so this problem may not apply there.
2 Contents & Annex C.4.2 Recursive and initial programs, pages xiv & 726
This is identified separately in the table of contents (arguably not a
bad idea in this particular case, though inconsistent), but its heading
is used on pages from its start position in Annex 4 to the very end of
Annex 4, even where it is not applicable.
3 Contents, page xiv
Annex F: shouldn't there be some way of indicating the appropriate
letter of this grouping, in a manner similar to that of Annex E?
Perhaps all the preliminary headings for the annexes should commence
with their primary letter, e.g. "A Communications facility" as is done
with numbers for the numbered sections, e.g. "16 Standard classes".
4 Conformance, 3.1.6, Reserved words, page 4
Perhaps insert "shall" at the beginning of line 3, to be consistent
with the rest of the sentence.
5 Definitions, 4.197, Floating-point numeric literal, page 17
I think it would be preferable to follow the general form used to
describe a fixed-point numeric literal, for example by adding the
following text to that already present:
"and is expressed as a literal comprising the significand and radix in
that order with no intervening spaces. The significand is expressed in
the same way as a fixed-point numeric literal. The radix is expressed
by the upper or lower case letter "E", followed immediately by a plus
or minus sign, followed immediately by another fixed-point numeric
literal of from one to three numeric digits only.".
If this is considered too long-winded, then perhaps just the first
partial sentence should be added.
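As an illustration only, the form proposed above could be checked with
a pattern like the following Python sketch. The names FLOAT_LITERAL and
is_float_literal are hypothetical. The sketch assumes that the
significand may carry an optional leading sign and a decimal point that
is not its last character, which is how I understand fixed-point
numeric literals to be written. The part after "E" follows the proposed
wording: a mandatory sign, then one to three digits.

```python
import re

# Hypothetical pattern for the proposed description: a significand written
# like a fixed-point numeric literal, then "E" or "e", then a mandatory
# plus or minus sign, then one to three digits, with no intervening spaces.
FLOAT_LITERAL = re.compile(r"[+-]?(?:\d+(?:\.\d+)?|\.\d+)[Ee][+-]\d{1,3}")


def is_float_literal(text: str) -> bool:
    """Return True if text matches the sketched floating-point literal form."""
    return FLOAT_LITERAL.fullmatch(text) is not None
```

Under these assumptions "1.5E+10" is accepted, while "1.5E10" (no
sign) and "1.5 E+10" (intervening space) are rejected.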
6 Definitions, 4.216, Identifier, page 18
Consider whether this is too restrictive; also see 8.4.2 Identifiers.
Procedures should perhaps also be included, especially functions and
methods. It is arguable that references to these latter are to the
temporary data items that are implied by such references, and
conceivably the same could be said for called programs. However, the
term identifier implies identification of any identifiable feature,
resource, service or user defined element of the language rather than
just the data. The term "data-identifier" would be a more accurate
description for the current definition. However, it may be that even
if a revision is considered desirable, it would be better left to the
next standard.
7 Definitions, 4.314, Numeric Function, page 23
Insert a comma between "numeric" and "but".
8 Definitions, 4.xxx, Text-word
It might be beneficial to add this term, perhaps with the following
description derived from "7.1.1.4 Text-words":
"A character string in source or library text that constitutes an
element processed by the text manipulation statements COPY and
REPLACE."
9 Definitions, 4.477, User-defined word, page 31
While true, I don't think it is particularly helpful, at least not on
its own. Perhaps one should add something along the following lines:
"Such words are mainly used as identifiers for data items and
procedures.".
or
"Such words are mainly identifiers used to name the elements, and
properties of elements, of a program whose characteristics the user
has specified using the reserved words and other elements of the
language.".
As a general point, I am not sure that level numbers fit very
comfortably within the grouping of elements comprising user-defined
words.
10 Reference Format, 6
I like the idea of concatenation of literals rather than continuation
as expressed by Bryan Randall in 99-0423. I am not keen on his idea of
substituting "AND" for "&". But, while I think that the "&" symbol is
a reasonable way of identifying concatenation in a manner that is
easily understood, one major benefit of using words of several
characters rather than equivalent single character symbols with the
same meaning is that, when writing and amending programs, it is harder
to create inadvertent additions, deletions, substitutions or
transpositions that are still syntactically valid. However, in the
case of concatenation I think it is rather difficult to create errors
of this nature with delimited literals, because the concatenation
operator needs to be between closing and opening delimiters, with
spaces for separators.
I think that continuation of COBOL words, literals and picture
character-strings should be phased out of the standard as soon as
possible, ideally as an exception to the usual procedure of making it
obsolete first. Doing so would make the rules for reference format and
the COPY and REPLACE statements much easier to develop and understand.
If it is possible for a compiler to recognise and handle such a
feature, it should be reasonably easy for automatic conversion programs
to do the same when converting continued COBOL words, literals and
picture character-strings. The committee could perhaps even provide a
single-purpose program to do the conversion to make the sudden change
more palatable to users, though it is arguable that it would be better
as part of an automatic conversion program to deal with all problems at
once. As a programmer I have never continued COBOL words, literals or
picture character-strings from one line to another. I have only very
rarely seen literals continued and never seen a COBOL word or picture
character-string continued. When I needed to provide values for large
data items, I always subdivided them into manageably sized FILLER data
items first.
(There are some possible uses of the COPY and REPLACE statements that
it might be difficult to convert, for example replacing large continued
literals. In such cases, the conversion would need to convert the text
to be replaced in both COPY and REPLACE statements, and the other
source and library text that it currently matches, so that the literals
are broken and concatenated in the same way and can still be matched.
Probably, continued literals used as
replacing operands should be flagged for user-intervention.)
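To give a rough idea of the core step of such a conversion program,
here is a minimal Python sketch that breaks the content of one literal
into pieces joined by the "&" concatenation operator. It is not a
complete converter. The function name concatenate_literal and its width
parameter are hypothetical, and the sketch ignores reference format
columns, continuation indicators, and the COPY and REPLACE matching
problems described above. It assumes quotation marks as delimiters.

```python
from typing import List


def concatenate_literal(content: str, width: int) -> List[str]:
    """Split literal content into quoted pieces joined by the "&" operator.

    content is the literal's value without its delimiters; width is the
    maximum number of content characters per piece (an assumed parameter).
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    # An empty literal still yields one (empty) piece.
    pieces = [content[i:i + width] for i in range(0, len(content), width)] or [""]
    lines = []
    for index, piece in enumerate(pieces):
        # A quotation mark inside a literal is written as two quotation marks.
        quoted = '"' + piece.replace('"', '""') + '"'
        # Every piece except the last is followed by the concatenation operator.
        suffix = " &" if index < len(pieces) - 1 else ""
        lines.append(quoted + suffix)
    return lines
```

For example, concatenate_literal("ABCDEFGH", 3) returns the three lines
"ABC" &, "DEF" & and "GH".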
11 Coded character sets and reference format
This mainly involves 6, Reference format and 8.1, Character sets.
I think that the note in 6, item 1c should be expanded to emphasise
that the number of bytes used to represent a character may vary between
alphanumeric and national coded character sets and that therefore even
a fixed-form reference format line is variable in length in terms of
the number of bytes needed to represent the characters contained. I
seem to remember that one of the current year's COBOL papers mentioned
this, though I haven't been able to find it again.
I think that this should also be listed in Annex D.2, Substantive
changes not affecting existing programs, perhaps as an item entitled
"Fixed-form reference format line length".
I don't think that the standard specifies that either an alphanumeric
or a national character should occupy an integral number of bytes. I
think that a clear statement one way or the other is highly desirable
in "8.1.1, Computer's coded character set". An amendment could also be
made to the "USAGE clause, 13.16.61.3, general rule 8, fourth line" to
replace "multiple" by "integral multiple". "USAGE clause 13.16.61.3
general rule 7" could be amended to state that each character shall be
represented by an integral number of bytes. "Annex D.1, Substantive
changes potentially affecting existing programs, item 7, Size of
characters for USAGE DISPLAY" states that "The size of a character
represented in USAGE DISPLAY is defined to be the same as the size of
a byte in the architecture of the computer.". The USAGE clause doesn't
make such a statement, see "13.16.61.3 general rule 7".
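The point about line length in bytes can be shown with a small Python
sketch. The encodings are stand-ins chosen only for illustration: UTF-8
for an alphanumeric coded character set and UTF-16 for a national one.
The draft does not prescribe either, and the function name byte_length
is hypothetical.

```python
# Assumed stand-ins: UTF-8 for an alphanumeric coded character set and
# UTF-16 (big-endian, no byte order mark) for a national one.
def byte_length(text: str, encoding: str) -> int:
    """Return the number of bytes needed to encode text."""
    return len(text.encode(encoding))


line = "MOVE A TO B"  # 11 character positions in either representation

alphanumeric_bytes = byte_length(line, "utf-8")    # one byte per character here
national_bytes = byte_length(line, "utf-16-be")    # two bytes per character here
```

The same 11 character positions need 11 bytes in the first case and 22
in the second, so a line of fixed length in characters is not of fixed
length in bytes.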
12 Reference format 6, page 36
Item 1b, perhaps replace "reference format" by "both reference formats"
or by "both fixed-form and free-form reference formats". On the other
hand one could argue that item 1b is superfluous as its contents are
effectively stated in the third sentence of the introductory paragraph.
13 Reference format 6, page 36
Item 1d, perhaps add the word "the" after the first word "For".
14 Reference format 6.1.2, Floating indicators, page 37
Literal continuation indicator, second line, "symbol" is misspelt.
15 Reference format 6, continuation, generally
I think that the terminology for the continuation of lines is
confusing. The first paragraph of "6.2.4 Continuation of lines" states
the general case where no continuation markers are used, but the second
paragraph proceeds to qualify this by defining special meanings for the
terms continuation lines and continued lines and, arguably therefore,
implying that the wider sense of continuation without continuation
indicators does not apply to the first paragraph. The first and second
paragraphs of "6.3.1 Continuation of lines" are similarly confusing.
Maybe adding the text "Additionally," to the beginning of both second
paragraphs would reduce the chance of confusion. Also, it may be
desirable to devise suitable terminology to identify continuation and
continued lines with and without continuation markers.
My comment in item 7 of my earlier paper 99-0519 suffered to some
extent from my misunderstanding of these paragraphs.
Item 1d, perhaps add the sentence "Also each line that is not a
continued line is treated as if it were followed by a separator space."
This would probably be a better solution to the problem identified in
item 7 of my earlier paper 99-0519. Arguably, the last paragraph of
6.2.4, Continuation of lines, covers the position for fixed-form
reference format, though I don't think that it is the best placement,
as people reading the standard would tend not to look at the rules for
continuing lines to find the appropriate rules for non-continued lines.
Rules that require a preceding or following space would, I think, be
improved by specifying the requirement as "a preceding real or assumed
space" or "a following real or assumed space", e.g.
"Reference format, 6.2.6.1, Comment lines" - see below,
"COPY statement, 7.1.2.2, Syntax rule 2",
"REPLACE statement, 7.1.3.2, Syntax rule 2",
"8.3.2 Separators, generally", maybe an introductory note
referring to the use of assumed spaces would be helpful, perhaps
with a suitable reference to 6 Reference format.
"8.3.2 Separators, item 5, para 6", commencing "The opening
delimiter shall be immediately preceded by a space ..." and also
acknowledging that the last non-blank character of a non-continued
line is presumed to be followed by a space,
"8.3.2, item 6 for pseudo-text" similarly,
"8.3.1 Character strings", where a character string commences in
column one of a line, or where it terminates in the last
character position of a line that is not continued.
16 Reference format 6.2.6.1, Comment lines, page 40
Perhaps insert ", which is preceded only by real or assumed spaces" at
the end of the first sentence.
17 Reference format 6.2.7, Debugging lines, page 40
Second line, perhaps replace "any place" by "anywhere".
18 Reference format 6.3.3, Comments, page 41
Third para, I thought that what an implementor puts in a source listing
was implementor dependent. See 7.2.12, LISTING directive and B.1
Implementor-defined language element list, item 115, "LISTING and PAGE
directives (whether and when the compiler produces a listing)".
19 Reference format 6.3.3.1, Comment lines, page 41
Maybe insert "non-blank" before "character-string".
20 Reference format 6.3.3.2, Inline comments, page 41
Maybe insert "non-blank" before "character-strings".
21 Reference format 6.4, Logical conversion, page 42
Review interaction between items 7 and 9.
Maybe add the text "that does not follow a line continued with a
floating literal continuation indicator" to the end of the first line
of item 7.
22 Source text manipulation, 7.1.1.3, Pseudo-text, page 45
I am not sure how pseudo-text is continued. Is pseudo-text everything
between the opening and closing delimiters, irrespective of how many
lines are involved? If so, then rules to that effect would be in
order, possibly in reference format with the other continuation rules.
Is it allowable for either pseudo-text delimiter to appear on its own
on a beginning or terminating line, or should there be rules similar to
those of the continuation of alphanumeric, national and boolean
literals, but without continuation indicators? As I understand it, any
real or assumed spaces immediately following the opening delimiter
would be ignored as would those immediately preceding the closing
delimiter. I think that it would be undesirable to allow the two
characters comprising the pseudo-text delimiter to be split for
continuation and that rules similar to those for literal delimiters
should apply.
It might be considered that some of this is self-evident, but it took
me quite a while to determine this from the various scattered rules of
reference format and the COPY and REPLACE statements.
23 Source text manipulation, 7.1.1.4, Text words, page 46
Item 1, third line, split "ofcontext" into separate words.
24 COPY Statement, 7.1.2.1, General format, page 47
Are "text-name", "library-name", "word", "pseudo-text", "text" and
"partial-word" meta-terms or should they be defined or listed in
"8.3.1.1.1 User defined words"? Should "word" be "COBOL word"?
25 COPY Statement, 7.1.2.1, General format, page 47
I think there is a case for making the use of "literal-1", "literal-2",
"word-n" and "text-n" obsolete. I think it would be fairly easy for an
automatic conversion program to make the change to existing programs.
26 COPY Statement, 7.1.2.2, Syntax rules, page 47
Rule 11, the two instances of the text
"qualified-data-name-with-subscripts, reference modification" imply
that reference modification is an identifier. Perhaps it should be
rephrased as
"qualified-data-name-with-or-without-subscripts-and-with-or-without-reference-modification".
Line two, perhaps there should be an additional sentence similar to
that commencing with "If subscripting ..." to deal with reference
modification.
27 COPY Statement, 7.1.2.2, Syntax rules, page 48
Rule 12, replace "initiator" by "indicator".
28 COPY Statement, 7.1.2.2, Syntax rules, page 48
Rule 13, is the maximum length really 322 characters? If so, why? It
was the same in COBOL 85.
29 COPY Statement, 7.1.2.2, Syntax rules, page 48
(I now think that the only reason sole commas and semicolons are
disallowed is because of their use as separators in the replacement
process.)
(I still can't make up my mind whether the following is worth
considering.)
Rule 14, I think there is a case for also excluding a sole period as
well as sole separator commas and semicolons. After all, a compiler
would not be able to make much sense of the result. A single separator
space could also be reasonably excluded, as perhaps could any number of
spaces with no other characters. Is the term separator needed in the
rule? Arguably single spaces, parentheses, quotes and apostrophes are
already excluded from having any effect by the specification of
"7.1.1.4 Text-words", though they could still be present.
Consider other single characters that should not be allowed, e.g.
hyphens, ampersands, quotes, apostrophes, plus and minus signs,
relational operators. Should any single characters be allowed? Maybe
only single characters not in the COBOL character repertoire should be
allowed. Should character strings representing single reserved or
context sensitive words be allowed (there may be good cases where such
words should be allowed, though I haven't yet thought of any)? Are
there other multiple groupings of characters that should not be
allowed, for example two colons, two asterisks, and the relational
operators ">=" and "<="?
There is presumably a limit on how much protection against misuse the
standard should provide. Also, it may be that some replacing
operations need to be able to change certain items on a limited scale
that would not be acceptable if done globally.
30 COPY Statement, 7.1.2.3, General rules, page 48
Rule 7, when using literal-3 and literal-4, are the enclosing literal
delimiters part of the text to be replaced and substituted? I think
that a statement to specify this one way or the other is highly
desirable. Arguably, "7.1.1.4 Text-words" specifies that the opening
and closing delimiters are included, though I still think that rule 7
would be clearer for saying so explicitly when literals are substituted
in this way for pseudo-text.
31 COPY Statement, 7.1.2.3, General rules, page 48
Rule 8c item 1, insert "resultant" after "Each" at the front of the
second sentence to make it clear that this applies to a sequence of
presumed spaces as well.
32 COPY Statement, 7.1.2.3, General rules, page 50
Rules 11 and 14 seem to conflict with respect to the treatment of
comments and blank lines. Would it be better to explicitly specify
both "inline comments" and "comment lines" rather than just "comments"?
33 REPLACE Statement, 7.1.3, Generally
Some of the above comments on the COPY statement also apply to the
REPLACE statement.
34 Character strings, 8.3.1.2.2, Numeric literals, page 88
It would be desirable to be able to use grouping separators in numeric
literals. This would make programs more readable and would reduce the
likelihood of errors by programmers when defining and amending large
numeric literals.
35 General, Floating-point data-items, mainly USAGE clause 13.16.61.3
Consider whether it should be explicitly specified that each such data
item should occupy a whole number of bytes and start at a byte
boundary. I realise that internal representation, alignment and
occupancy are implementor defined, and would agree with that up to a
point, but wonder whether it is intended to allow such data items to be
treated like usage BIT. See "USAGE clause 13.16.61.3, General rules".
Similar comments apply to usage BINARY, COMPUTATIONAL, INDEX,
PACKED-DECIMAL, BINARY-xxx, OBJECT-REFERENCE, POINTER and
PROGRAM-POINTER.
BIT and NATIONAL are obvious justifiable exceptions, if national
characters can be represented by a non-integral multiple of bytes that
is one or greater.
Does it matter? If no such requirement is imposed, shouldn't there be
rules similar to those in "8.5.1.6.1.3 Alignment of data items of usage
bit", which could also be applied to usage national?
Also see the provisions of "8.5.1.6 Standard data alignment rules" and
"8.5.1.7 Item alignment for increased efficiency".
36 Culturally-specific, culturally-adaptable and multilingual
applications, C.13.2.4, Locale-based case classification of letters,
page 765
Sixth para or separate text grouping, replace "picture string" by
"picture character-string" for consistency with the rest of the
standard.